Free energy changes associated with amino acid substitution in proteins



Mansoor A.S.Saqi and Julia M.Goodfellow 

Department of Crystallography, Birkbeck College, Malet Street, London
WC1E 7HX, UK

The estimation of free energy differences from computer
simulation of macromolecular systems is important for
rational strategies for drug design and for protein engineering.
As an example of one mutation, we have studied the free
energy change resulting from the conversion of a polar group
(OH) to an apolar group (CH3) in aqueous solution. We have
estimated the effect of various local environments on the
magnitude of the free energy difference and find that the
environmental effects are significant. We have also
studied the reliability of the results in detail.



and phenylalanine, comparison with data on representative model
systems for the sidechains (Wolfenden et al., 1981; Ben-Naim
and Marcus, 1984) shows good agreement.

Bash et al. (1987a) have performed a series of free energy
calculations involving many mutations of amino acid side chains.
The encouraging results of these simulations have prompted
similar calculations on more specific biological systems, such
as the free energies of binding and of activation for catalysis of a
tripeptide substrate by native subtilisin and a subtilisin mutant
formed by changing Asn155 to Ala155 (Rao et al., 1987).
Although only small differences are found in the binding free
energy, the corresponding catalytic free energy is substantial and
in good agreement with experimental results. This method has
also been [...] inhibitors (Bash et al., 1987b).



[...] in biological systems (Jorgensen, 1989; [...], 1989). Such
calculations have been described as 'the Holy Grail' of theoretical
chemistry [...], in addition to simulations of specific [...]. Whilst
the statistical mechanics of the free energy perturbation (FEP)
method are [...]. Despite their limitations, free energy calculations
have attracted much attention [...] experiments in protein
engineering [...].

The FEP method has been used to study the [...] larger change, that
of CH3OH to CH3CH3 (Jorgensen and Ravimohan, 1985). This
interconversion of methanol to ethane in aqueous solution has now
become one of the standard tests of free energy simulation methods.
The calculation of free energies for mutations involving large
perturbations introduces a number of problems. The mutation cannot
be carried out in one step and must be broken down into several
steps, and the free energies accumulated to give the total free
energy change. The exact method and the length of the simulation at
each interval can affect the final answer.

Singh et al. (1987) have studied, using molecular dynamics,
a large structural change, viz. the interconversion of
phenylalanine to alanine in aqueous solution. Although there are no
experimental data for the free energies of solvation of alanine
[...] mutation and a threonine to valine mutation in an Ala-Thr-Ala
tripeptide and [...] a helical [...] precision and [...].

Computational procedures

The difficulty in estimating free energies from simulation
calculations is well known (Mezei and Beveridge, 1986) and
arises from the fact that the free energy is not an ensemble
average. The free energy difference between two states i and j
can, however, be expressed in terms of an ensemble average by

$$F_{ij} = -kT \ln \langle e^{-\beta U_{ij}} \rangle_i \qquad (1)$$
Table I. Methanol to ethane mutation

Coupling parameter
λ_i      λ_j      F_ij                F_ji

0        0.125                        -2.086 ± 0.133
0.125    0.25     2.2[?]2 ± 0.123     -1.85[?] ± 0.094
0.25     0.5      2.263 ± 0.213       -1.858 ± 0.097
0.5      0.75     0.835 ± 0.091       -0.926 ± 0.102
0.75     1.0      0.282 ± 0.092

The forward (F_ij) and reverse (F_ji) free energies in kcal/mol. Equilibration
was for 400 000 Monte Carlo steps and data were collected in batches of
50 000 for the next 1.0 x 10^6 steps.

where i is the initial state, j is the final state, β is 1/kT and
$U_{ij} = U_j - U_i$, where $U_i$ and $U_j$ are the configurational energies
of states i and j respectively.
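Equation (1) can be evaluated from a set of sampled energy differences roughly as follows. This is a minimal sketch, not the procedure used by the simulation program: the function name, the sample values and the temperature of 298.15 K are illustrative assumptions, and the shift by the smallest energy difference is only a numerical safeguard.

```python
import math


def fep_free_energy(delta_u, kT):
    """Estimate F_ij = -kT ln <exp(-U_ij / kT)>_i from sampled U_ij values.

    delta_u: values of U_ij = U_j - U_i evaluated on configurations
             sampled in state i.
    kT:      thermal energy in the same units as delta_u.
    """
    if not delta_u:
        raise ValueError("need at least one sample")
    beta = 1.0 / kT
    # Shift by the minimum so every exponent is <= 0 (avoids overflow).
    shift = min(delta_u)
    mean_exp = sum(math.exp(-beta * (du - shift)) for du in delta_u) / len(delta_u)
    return shift - kT * math.log(mean_exp)


# Hypothetical samples in kcal/mol (not data from this study).
KT_ASSUMED = 0.0019872 * 298.15  # gas constant in kcal/(mol K) times an assumed T
samples = [1.9, 2.1, 2.0, 2.3, 1.8]
estimate = fep_free_energy(samples, KT_ASSUMED)
```

Because the average is exponentially weighted, the estimate is dominated by the lowest sampled energy differences and never exceeds their arithmetic mean, which is one way to see why a large perturbing potential makes the average hard to converge.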

Whilst equation (1) is exact, a meaningful estimate of the
ensemble average can only be made if the perturbing potential
$U_{ij}$ is small. The ensemble average is then given by

[...] (2)

The reason that the perturbing potential ($U_{ij}$) has to be small is
Fig. 1. Convergence profile for the Ala-Thr-Ala → Ala-Val-Ala mutation.
Batch averages of free energy are plotted for two (N, V, T) simulations
and one (N, P, T) simulation. The data are [...]

[...] precisely, $\exp(-\beta U_{ij})$ [...] contribute most to the [...].
The free energy change is, however, [...] quantity, and so the total
free energy change between an initial (0) and final (1) state [...].
The number and size of the windows are important [...] realistic
results because the free energy change does not vary linearly with λ.

The hybrid hamiltonian is described by

[...] (3)

and the geometry of the system is also varied in a similar fashion.
Our Monte Carlo simulations [...] (at the [...] Computer Centre) using
the program BOSS (W.L.Jorgensen, Purdue University, Indiana),
which includes double-wide preferential sampling and a feathered
cut-off between 8.0 and 8.5 Å. Double-wide sampling means that
free energy differences for $\lambda_i \to \lambda_{i+1}$ and
$\lambda_i \to \lambda_{i-1}$ can be obtained from one simulation,
since in both cases the sampling is based on the $\lambda_i$ distribution.

In this study we use five windows, with λ = 0, 0.125, 0.25,
0.50, 0.75 and 1.0, for both the methanol/ethane mutation and
the threonine/valine mutation. An important check on the
precision of the simulation is to compare $F_{ij}$ and $F_{ji}$. Ideally,
$F_{ij} = -F_{ji}$. The discrepancy between the forward ($F_{ij}$) and
backward ($F_{ji}$) simulation is known as the hysteresis and is due
to inadequate sampling caused by too large a perturbing potential.
Observation of the hysteresis effect for each window enables us
to highlight the difficult windows, which should be sub-divided
into smaller intervals.

[...] (Jorgensen [...]) united atom approach [...] hydrogen atoms [...]
are not represented explicitly [...] water molecules [...] using the
partial volumes for the solute (Zamyatnin, [...]) and they are
modelled using TIP4P [...]. [...] oxygen [...] mutated into the CH3
group of ethane, and the hydrogen of methanol [...] bond length of
[...] for methanol [...] grow to the C-C bond length of 1.53 Å for
ethane. 0.4 x 10^6 MC steps of equilibration are used and data are
collected in batches of 50 000 for the next 1 x 10^6 steps.

A similar protocol is used for the N, V, T simulation of the
mutation of threonine to valine within an Ala-Thr-Ala tripeptide
in the [...] helical conformation [...] set at [...] (McGregor et al.,
1987). The [...] of the threonine side chain is mutated [...] the
geometry of the system remains fixed [...]. For this system, the
input to BOSS is [...] coordinates [...] directly to the program [...]
input.

We repeated the calculation for the first two windows
(λ = 0 → 0.125 and λ = 0.125 → 0.25) for the threonine to valine
mutation in the tripeptide by changing the starting seed. This
allows the system to take a different walk through phase space,
and helps to give some indication of the adequacy of sampling.
We also performed an N, P, T simulation for this window.

We then ran an N, P, T simulation of the mutation of threonine
to valine, but this time as part of the pentapeptide
Ala-Lys-Thr-Lys-Ala, again in the helical conformation. The torsion
angle, χ, for lysine was taken as 287° (McGregor et al., 1987) and






the remaining χs set at 180°. χ for threonine was fixed at [...].

Fig. 2. The Ala-Lys-Thr-Lys-Ala pentapeptide. The threonine residue is
mutated to a valine.

The free energy differences for the windows used in the mutation
of methanol to ethane are given in Table I. An initial 400 K
configurations are used for equilibration and 20 batches
of 50 K configurations are used to calculate the average free
energy change (and standard deviation) for each window. The
total F in the forward direction, λ = 0 → λ = 1, is +8.68 kcal/mol,
i.e. it is unfavourable as a polar hydroxyl group is being
replaced by a methyl group. The reverse F is -7.98 kcal/mol.
These values assume no hysteresis for the first and last windows.
Thus, the average magnitude of 8.3 kcal/mol should be
compared with an experimental value of 6.9 kcal/mol. Moreover,
the hysteresis effects are fairly small for this system, as the
forward and reverse magnitudes of the free energy difference
agree within [...] or 0.44 kcal/mol. The first window
(λ = 0 → 0.125) produces the largest change in free
energy, consistent with previous studies (Jorgensen and
Ravimohan, 1985).
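The batch averaging used for each window can be sketched as follows. This is an illustration only: it assumes the quoted uncertainty is the sample standard deviation of the per-batch estimates (the text does not say whether a standard deviation or a standard error is reported), and the batch values are hypothetical rather than taken from these simulations.

```python
import statistics


def batch_summary(batch_free_energies):
    """Return (mean, sample standard deviation) of per-batch free energy estimates."""
    if len(batch_free_energies) < 2:
        raise ValueError("need at least two batches to estimate a spread")
    mean = statistics.mean(batch_free_energies)
    spread = statistics.stdev(batch_free_energies)  # n - 1 denominator
    return mean, spread


# Hypothetical per-batch estimates for one window, in kcal/mol.
batches = [2.10, 2.35, 2.20, 2.05, 2.40, 2.25]
mean, spread = batch_summary(batches)
print(f"{mean:.3f} ± {spread:.3f} kcal/mol")
```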

A similar hydroxyl to methyl mutation is involved in the
conversion of a threonine to a valine residue within a tripeptide.
A similar number of windows is used but the average free
energy change per window is obtained from 25 batches of 50 K
configurations. The total forward F is 6.85 kcal/mol. Again
this is positive because of the unfavourable change of a hydroxyl
to a methyl group. The total reverse F is -7.20 kcal/mol. The
magnitudes of these numbers agree within 0.35 kcal/mol. The
average magnitude is 7.05 kcal/mol, compared with an average
magnitude of 8.33 kcal/mol found in the methanol to ethane
mutation.
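The accumulation of window free energies into a total can be sketched as below, using the per-window N, V, T values as read from Table II. The sketch assumes, as stated for the methanol/ethane case, that an end window sampled in only one direction has no hysteresis, so the missing direction is taken as minus the available one. With the values as transcribed here the reverse total agrees with the -7.20 kcal/mol quoted above, while the forward total comes out close to, but not exactly at, the quoted 6.85 kcal/mol.

```python
# Per-window values (kcal/mol) as read from Table II, N, V, T ensemble.
# None marks the direction that was not sampled for an end window.
WINDOWS = [
    # (lambda_i, lambda_j, F_ij forward, F_ji reverse)
    (0.0,   0.125, None,  -2.561),
    (0.125, 0.25,  1.854, -1.368),
    (0.25,  0.5,   1.763, -1.809),
    (0.5,   0.75,  0.359, -1.122),
    (0.75,  1.0,   0.342, None),
]


def total_free_energies(windows):
    """Sum window contributions; a missing direction is taken as minus the other."""
    forward = 0.0
    reverse = 0.0
    for _, _, f_ij, f_ji in windows:
        if f_ij is None and f_ji is None:
            raise ValueError("window has no estimate in either direction")
        forward += f_ij if f_ij is not None else -f_ji
        reverse += f_ji if f_ji is not None else -f_ij
    return forward, reverse


forward, reverse = total_free_energies(WINDOWS)
average_magnitude = (abs(forward) + abs(reverse)) / 2.0
print(f"forward {forward:+.2f}, reverse {reverse:+.2f}, average {average_magnitude:.2f}")
```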

In order to test the precision of these estimates, further
calculations are undertaken on this threonine to valine mutation
within a tripeptide. These tests are only performed on the first
windows, with λ = 0.125 → 0.0 and λ = 0.125 → 0.25. Using
a different starting seed for the random number generator, values
of -2.28 ± 0.13 kcal/mol and +1.56 ± 0.11 kcal/mol are
obtained for the λ = 0.125 → 0.0 and λ = 0.125 → 0.25
windows respectively. These should be compared with the
previous values of -2.56 ± 0.14 and +1.85 ± 0.12 kcal/mol
for the same windows. Thus starting with a new seed produces
free energy changes which are slightly outside the sum of the
errors.
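The "sum of the errors" comparison can be checked directly from the numbers quoted above. The helper name is an illustrative assumption; the criterion is simply that two estimates agree if they differ by no more than the sum of their quoted uncertainties.

```python
def within_summed_errors(value_a, error_a, value_b, error_b):
    """True if two estimates differ by no more than the sum of their errors."""
    return abs(value_a - value_b) <= error_a + error_b


# Estimates quoted in the text (kcal/mol): (original seed, new seed).
comparisons = {
    "0.125 -> 0.0": ((-2.56, 0.14), (-2.28, 0.13)),
    "0.125 -> 0.25": ((1.85, 0.12), (1.56, 0.11)),
}

for window, ((a, err_a), (b, err_b)) in comparisons.items():
    gap = abs(a - b)
    allowed = err_a + err_b
    agree = within_summed_errors(a, err_a, b, err_b)
    print(f"{window}: gap {gap:.2f}, allowed {allowed:.2f}, agree {agree}")
```

For both windows the gap (0.28 and 0.29 kcal/mol) exceeds the summed errors (0.27 and 0.23 kcal/mol), which is the sense in which the new-seed results fall slightly outside them.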

We have also used the N, P, T rather than the N, V, T
ensemble. This produces free energy differences of -2.86 ±
0.08 kcal/mol and +2.31 ± 0.07 kcal/mol for windows
λ = 0.125 → 0.0 and λ = 0.125 → 0.25 respectively. The
[...] of these average numbers, which are each produced [...]
50 K configurations, is better than that obtained
from N, V, T ensembles. Moreover, there are differences of 0.3
and 0.46 kcal/mol between the same estimates obtained in the [...]

Table II. Threonine to valine mutation in Ala-Thr-Ala tripeptide

Coupling parameter
λ_i      λ_j      F_ij              F_ji

N, V, T ensemble
0        0.125                      -2.561 ± 0.135
0.125    0.25     1.854 ± 0.120     -1.368 ± 0.088
0.25     0.5      1.763 ± 0.163     -1.809 ± 0.237
0.5      0.75     0.359 ± 0.123     -1.122 ± 0.134
0.75     1.0      0.342 ± 0.132

N, P, T ensemble
0        0.125                      -2.856 ± 0.084
0.125    0.25     2.309 ± 0.067

New seed (N, V, T)
0        0.125                      -2.277 ± 0.133
0.125    0.25     1.558 ± 0.108

The forward (F_ij) and reverse (F_ji) free energies in kcal/mol. An
equilibration run for 200 000 Monte Carlo steps is performed initially at
constant volume, since undesirable volume expansion can occur to relieve [...]




[...] changes in procedure (e.g. a new starting seed in Monte Carlo
or a small change in a dynamics trajectory) can lead to
appreciably different answers. This is particularly true for larger
systems, where there will be a more complicated energy [...]
and more local minima. The fact that not all regions of phase
space are accessible, which results in incomplete sampling,
remains a problem with free energy simulations. Adaptive
importance sampling (Mezei, 1987) may be of use in this regard.

We have carried out a free energy 'window' perturbation
simulation for the mutation of an OH group to a CH3 group in
three different local environments. The same window intervals
are used in all cases. While the agreement between forward and
backward runs is satisfactory for the methanol/ethane mutation,
there are large hysteresis effects for window 4 (see Table II) in
the Ala-Thr-Ala tripeptide. For the pentapeptide there are
significant hysteresis problems for windows 2, 3 and 4. In this
case, the hysteresis for window 1, which accounts for the largest
free energy change, is quite small, although outside the error [...]

[...] Pettitt noticed that sometimes the total final answers are in
[...] accord [...] better [...] the end points, although in principle
any particular time could have been the physical end point. Perhaps
a good measure of [...] should be the largest discrepancy between
[...] window [...].

For the tripeptide, comparison of an N, V, T simulation with
two different starting seeds and an N, P, T simulation shows
reasonable agreement (Figure 1), although the averages are just
outside the error bars. Figure 1 suggests that the agreement is
improving with the length of the simulation.

Despite their tremendous potential for biomolecular simulation,
free energy perturbation calculations do suffer from the so-called
multiple minimum problem. Detailed studies on simple systems
reveal how robust the method is and allow problems to be
highlighted, so that the techniques can be used with greater
confidence on larger biological systems. Our results suggest that the
magnitude of the free energy of mutating an OH group to a CH3
group in aqueous solution is dependent on the local environment,
as it becomes less on going from the methanol/ethane system
to the Ala-Thr-Ala tripeptide and finally the
Ala-Lys-Thr-Lys-Ala pentapeptide. The accuracy of the results
decreases for the more complicated systems. A different choice of
window intervals may lead to better sampling for the larger systems and
it would be interesting to use the recent method of dynamically
modified windows (Pearlman and Kollman, 1986b) for these
systems. Alternatively, longer simulations for the windows
showing large hysteresis effects may be necessary.
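One simple way to act on the hysteresis diagnostics is to split any window whose forward and reverse magnitudes disagree by more than a chosen tolerance. The sketch below is an assumption-laden illustration: the midpoint rule and the 0.5 kcal/mol threshold are hypothetical choices, and this is not the dynamically modified windows method cited above. The window values are those read from Table II (N, V, T ensemble).

```python
def hysteresis(f_forward, f_reverse):
    """Difference in magnitude between forward and reverse estimates for one window."""
    return abs(abs(f_forward) - abs(f_reverse))


def refine_schedule(windows, threshold):
    """Return a lambda schedule with a midpoint added to each high-hysteresis window.

    windows: list of (lambda_i, lambda_j, F_ij, F_ji). Windows with a missing
    direction (None) are left unchanged because no hysteresis can be computed.
    """
    schedule = [windows[0][0]]
    for lam_i, lam_j, f_ij, f_ji in windows:
        both = f_ij is not None and f_ji is not None
        if both and hysteresis(f_ij, f_ji) > threshold:
            schedule.append((lam_i + lam_j) / 2.0)  # split the difficult window
        schedule.append(lam_j)
    return schedule


# Values in kcal/mol as read from Table II (N, V, T ensemble).
TRIPEPTIDE_WINDOWS = [
    (0.0,   0.125, None,  -2.561),
    (0.125, 0.25,  1.854, -1.368),
    (0.25,  0.5,   1.763, -1.809),
    (0.5,   0.75,  0.359, -1.122),
    (0.75,  1.0,   0.342, None),
]

new_schedule = refine_schedule(TRIPEPTIDE_WINDOWS, threshold=0.5)
print(new_schedule)
```

With this threshold only window 4 (λ = 0.5 → 0.75) is split, which matches the window singled out in the discussion of the tripeptide.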

Comparison between our free energy simulations and
experimental data is important. For this reason, we initially chose
to look at the ethane to methanol transition, for which experimental
data exist. There appear to be few experimental data on changes
in proteins with which we can make a direct comparison. The
work of Fersht and co-workers (Kellis et al., 1988; Serrano and
Fersht, 1989) and Alber et al. (1987) on the thermodynamic
stability of mutations to the proteins barnase and phage T4
[...] experimental results refer to changes essentially within the protein
and not in regions with large accessibility to solvent.